We investigate a game-theoretic model of a social system where both the rules of the game and the interaction 
structure are shaped by the behavior of the agents. We call this type of model, with several types of feedback 
couplings from the behavior of the agents to their environment, a multiadaptive game. Our model exhibits 
several dynamical regimes, each accompanied by distinct network-topological 
properties. Some of these regimes are characterized by heterogeneous, hierarchical interaction networks, where 
cooperation and network topology coemerge from the dynamics. 
Game theory is a language for describing systems in biology,
economy, and society where the success of an agent depends
both on its own behavior and the behaviors of others. Perhaps
the most important question for game-theoretic research is to
map out the conditions for cooperation to emerge among
egoistic individuals [1]. To this end, researchers have
developed a number of different types of models, capturing
different game-theoretic scenarios. In this Letter, we
investigate a generalization in a new direction: we relax
constraints of earlier models by including feedback at several
levels from the behavior of the agents.

In most game-theoretical studies, the rules of the game are
fixed in time, but in real systems there is a feedback from the
behavior of the agents to the environment and thus to the rules
of the game. The payoff of a player's action in a specific
situation is parametrized by payoff matrices. A straightforward
way of modeling feedback from the system to the rules is to
let the entries of the matrices be variables, dependent on the
state of the system [2]. Another feature that is often modeled
as static when, in reality, it does not have to be, is the contact
structure. If agents can change their interaction patterns in
response to the outcome of the game, then the model will also
capture the social network dynamics. Such adaptive-network
models [3–5] can address a wide range of problems: not only
how interaction determines the evolution of cooperation, but 
also how the interaction patterns themselves emerge. In this 
Letter we investigate a situation where agents can adjust their 
social ties to maximize their payoffs and the collective behav- 
ior of the agents shapes the rules of the game. Our model is 
an adaptive-network model with adaptive payoff matrices — a 
multiadaptive game, for short. 

A classic model for studying the evolution of cooperation in 
spatial game theory is the Nowak-May (NM) game [6]
(technically speaking, on the border between the archetypical
prisoner's dilemma and chicken games). It captures a situation
where at any moment defection has the highest expected payoff,
but under some conditions agents can do better in the long run
by establishing trust and cooperation. An interaction in the
NM game gives the following payoff: zero to anyone interacting
with a defector (D), one to a cooperator (C) meeting another
cooperator, and b > 1 to a D meeting a C. This model has been
used to explain the emergence of cooperation among egoistic
agents in disciplines as diverse as political science,
economics, and biology [1], and will be the starting point of
our work.
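
The payoff rule just described can be written down in a few lines. The following is only an illustrative sketch: the function names `nm_payoff` and `total_payoff` and the string labels "C" and "D" are assumptions made for this example, not notation from the Letter.

```python
def nm_payoff(own: str, other: str, b: float) -> float:
    """Payoff to `own` from a single interaction with `other` in the NM game."""
    if other == "D":
        return 0.0  # anyone interacting with a defector gets zero
    if own == "C":
        return 1.0  # cooperator meeting a cooperator
    return b        # defector meeting a cooperator gets the temptation b


def total_payoff(own: str, neighbor_strategies, b: float) -> float:
    """Sum of single-interaction payoffs over all neighbors of an agent."""
    return sum(nm_payoff(own, s, b) for s in neighbor_strategies)
```

For instance, a defector with two cooperating neighbors and one defecting neighbor collects 2b, whereas a cooperator in the same position collects 2.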

In our model we place L × L agents (in this Letter, we use
L = 100) on a square grid with fixed boundary conditions.
Besides interacting with its n local spatial neighbors (n = 4,
3, and 2 for internal, boundary, and corner agents,
respectively), each agent has one additional link that it is
free to rewire to optimize its position in the interaction
network [4]. The rationale behind this arrangement is that
people invest more in their spatially close contacts (e.g.,
family and coworkers) and thus are less likely to break these,
whereas the long-range edges are more businesslike and open to
optimization. In sociology this situation goes by the name
"strength of weak ties" [7].
FIG. 1: (Color online) Parameter dependence of the game reflected
in the temptation and average cooperator density. (a) shows the average
density of cooperators, p, as a function of the initial temptation, b_0,
with a = 0.1, 3, and 4. The bar represents points averaged over the
last 500 (of 10^) steps. (b) and (c) correspond to the time evolution of
p and b, respectively, for different values of b_0. (d) shows the diagram
over the three regions in a-b_0 space. The curves are averages over 10"^
runs.

In the NM game there is one parameter, the temptation to
defect b, representing external conditions of the game
("society" in a social interpretation of the game,
"environment" in the context of evolutionary biology). In this
work, we investigate the case when b is higher in a uniformly
rich society, whereas the motivation to cooperate is higher in
a society in unrest. Assuming a linear dependence of the
temptation to defect on the prosperity, in our case measured by
the density p of cooperators in the population, we use a
response function [2]

b(t+1) = b(t) + a\,[p(t) - p^{*}],   (1)

where we choose p* (representing a neutral cooperation level
from the society's perspective) as 1/2 for simplicity. The value
of a controls the strength of feedback from the environment
to the game rules. We will, unless otherwise stated, use a = 4.
We update the state of the system, both the strategies and the
long-range linked neighbors of the agents, synchronously. At
each time step, each agent i acquires a payoff u_i by playing
the Nowak-May game with all its local and long-range neighbors.
When an agent i, updating its strategy, has a higher payoff
than its neighbors, nothing happens. Otherwise, i adopts the
strategy of the neighbor j with the highest payoff with a
probability Π(i → j), and simultaneously rewires its free link
to the long-range neighbor of j. Following Ref. [4], we use

\Pi(i \to j) = \frac{1}{1 + \exp[-\beta\,(u_j - u_i)]},   (2)

where β controls the noise in the choice of whom to imitate.
This way of parametrizing noise is further discussed in Ref.
[8]. We use β = 1 in our present study, which is enough to
create heterogeneous structures but not enough to overshadow
the strategies as a factor in the dynamics.
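
As a minimal sketch of the two update rules, Eqs. (1) and (2) can be coded as below. The function names and keyword defaults are assumptions chosen for illustration (a = 4, p* = 1/2, and noise parameter 1 follow the values quoted in the text). The branch on the sign of the exponent is an implementation choice to avoid floating-point overflow when payoffs differ greatly; it is not part of the model.

```python
import math


def update_temptation(b: float, p: float, a: float = 4.0, p_star: float = 0.5) -> float:
    """Eq. (1): linear response of the temptation b to the cooperator density p."""
    return b + a * (p - p_star)


def imitation_probability(u_i: float, u_j: float, beta: float = 1.0) -> float:
    """Eq. (2): probability that agent i imitates neighbor j (Fermi-type rule)."""
    x = beta * (u_j - u_i)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x use the algebraically equal form e^x / (1 + e^x),
    # which never evaluates exp() on a large positive argument.
    e = math.exp(x)
    return e / (1.0 + e)
```

With equal payoffs the imitation probability is exactly 1/2, and a density p above p* raises b at the next step.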

Turning to the numerical results, in Fig. 1(a), we plot the
average density of cooperators p as a function of b_0 for three
values of a. For example, if a = 4 and b_0 < 2, the system
converges with certainty to a state with p ≈ p* = 1/2. We call
the region of parameter space with this behavior region I and
denote the large-b_0 border of this region b_{I,II}. For the
a-values of Fig. 1(a), b_{I,II} ≈ 2. For b_0 > b_{II,III},
cooperation vanishes. We call this part of a-b_0 space region
III. Between these extremes, there is a region II of complex
behavior where, depending on b_0, the cooperation density
converges to 1, p* (at least a value very close to p*), or 0
with probabilities depending on a and b_0. With increasing b_0,
the probability that the system ends in all-C decreases, and
vanishes completely at b_{II,III}.

In Figs. 1(b) and 1(c) we display trajectories of p and b,
averaged over 10"^ runs, for different b_0 values. These curves
show the system stabilizing to a steady cooperation level
after about 50 time steps. The transient oscillations that
precede this steady state can be explained by the adaptive
payoff dynamics. Assuming a well-mixed case, in which the
strategy adoption rate is proportional to the relative success
of the strategies, one can approximate the dynamics by the
replicator equation system

\frac{dp}{dt} = \begin{cases} p^{2}(1-p)(1-b) & \text{if } p \in [0,1], \\ 0 & \text{otherwise}, \end{cases}   (3a)

\frac{db}{dt} = a\,(p - p^{*}).   (3b)



The factors p^2 and 1 - p of Eq. (3a) give the fixed points
p = 0 and 1. From these equations, we can also understand the
oscillatory behavior of Figs. 1(b) and 1(c). If b > 1 and
p > p*, then b will increase and p decrease. This will, after
some time, make p < p* and thus db/dt negative. If db/dt stays
negative long enough for b to fall below 1, then dp/dt becomes
positive. Taken together, this explains the cyclic behavior.
Such oscillations, with growing and shrinking C (D) clusters
that drive the oscillations in p, can be seen with our Java
applet of the model [9]. For all parameter values we study, the
cyclic behavior will either increase in amplitude until p
reaches a fixed point, or be dampened to the fixed point close
to p*. Perhaps the most interesting observation is the onset of
the all-C state. As an example, for b_0 = 3.5 in Fig. 1(b), p
starts increasing again, but it is too late: the emergence of a
C hub, combined with the fact that b is still smaller than 1,
drives the system to the all-C state. For large b_0 (> 3.5), p
goes toward its final value monotonically, while, for smaller
values of b_0, the convergence is oscillatory. For
b_0 > b_{I,II}, the system hits the fixed points faster than
the response from the environment can tune the value of b. In
an extended model where D can appear, by mutation, in an all-C
state, all-C would not be evolutionarily stable.
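
To explore the mean-field picture numerically, one can integrate Eqs. (3a) and (3b) directly. The sketch below uses a plain forward-Euler scheme; the step size `dt`, the number of steps, the clipping of p to [0, 1] after each step, and all function names are assumptions made for this example, not choices taken from the Letter.

```python
def replicator_rhs(p: float, b: float, a: float = 4.0, p_star: float = 0.5):
    """Right-hand sides of Eqs. (3a) and (3b) for density p and temptation b."""
    dp = p * p * (1.0 - p) * (1.0 - b) if 0.0 <= p <= 1.0 else 0.0
    db = a * (p - p_star)
    return dp, db


def integrate(p0: float, b0: float, a: float = 4.0, p_star: float = 0.5,
              dt: float = 0.01, steps: int = 5000):
    """Forward-Euler trajectory [(p, b), ...] starting from (p0, b0)."""
    p, b = p0, b0
    trajectory = [(p, b)]
    for _ in range(steps):
        dp, db = replicator_rhs(p, b, a, p_star)
        p = min(1.0, max(0.0, p + dt * dp))  # keep the density in [0, 1]
        b += dt * db
        trajectory.append((p, b))
    return trajectory
```

Evaluating `replicator_rhs` at p = 0 or p = 1 returns a vanishing dp/dt, matching the fixed points named in the text, while a state with b > 1 and p > p* gives dp/dt < 0 and db/dt > 0. Forward Euler is used only for brevity; a higher-order integrator would be a natural replacement for quantitative work.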

In Fig. 1(d), we plot a diagram over the regions of a-b_0
parameter space with distinct dynamic behavior. We identify
region I as when p at convergence lies within 0.005 of p*,
i.e., |p - p*| < 0.005, and region III as when the converged p
is less than 0.005. We note that the boundary value, b_{I,II},
separating region I from II decreases with an increasing a
(b_{I,II} ≈ 2 for a < 3 and b_{I,II} → 1 as a grows towards
11). In region I, for all measured values of a, the system
relaxes to a steady state with p ≈ p* and b converges to a
stable value. For example, b_0 = 1.3 gives b(t → ∞) ≈ 2.6
[Figs. 1(b) and 1(c)]. This happens when the feedback in
Eq. (1) is strong enough to balance b. When b_0 increases
beyond b_{I,II}, the feedback from the environment starts
affecting b so strongly that the system inevitably hits an
absorbing state. At a fixed point, b grows (if p = 1) or
decreases (if p = 0) unboundedly. In this situation, as the
fixed points in any real system would be metastable rather than
permanent, b should not be overinterpreted. Alternatively, one
can limit the temptation by, in Eq. (1), letting b(t + 1) = B
if b(t + 1) > B and letting b(t + 1) = -B if b(t + 1) < -B. If
B is large enough (B > 4, for our parameter values), the
conclusions from such a model are the same as for the one
presented in this Letter; otherwise region II can vanish
(results not shown). Preliminary studies suggest that an all-C
state also requires a frequent updating of the strategies. Now
strategies and links are updated equally often, but if the link
update is 100 times more frequent than strategy updating, all-C
states almost never happen. If, on the other hand, the time
scale is skewed in the other direction, the conclusions from
Eq. (1) remain the same. As a final note about Fig. 1(d), we
see that b_{II,III}, separating region II from III, increases
monotonically with a. That is, cooperation is enhanced by the
feedback from the environment to the payoff matrices.
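
The region criteria and the bounded-temptation variant described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, `p_final` is taken to be the converged (run-averaged) cooperator density, and labeling everything that is neither region I nor region III as region II is a simplification made here.

```python
def clamp_temptation(b: float, B: float) -> float:
    """Bounded variant of Eq. (1): keep the updated temptation inside [-B, B]."""
    return max(-B, min(B, b))


def classify_region(p_final: float, p_star: float = 0.5, tol: float = 0.005) -> str:
    """Label a converged cooperator density with the region criteria of the text."""
    if abs(p_final - p_star) < tol:
        return "I"    # density settles within tol of the neutral level p*
    if p_final < tol:
        return "III"  # cooperation has essentially vanished
    return "II"       # assumption: everything else counts as the mixed region
```

The clamp would be applied to the value returned by the update in Eq. (1) at every time step.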

Now we turn to the connection between game dynamics 
FIG. 2: (Color online) Correlations between the strategy and network structure. Circles (squares) correspond to the average density of 
cooperators (defectors) with degree k, p_k. (a) is for b_0 = 1.3 (region I), (b) is for b_0 = 3.5 (region II), and (c) is for b_0 -1 (region III). In panels 
(b) and (c), the exponent of the power law is 2.7 ± 0.1. 



and network structure. In this analysis, we only consider the
network of long-range links, not the background square grid.
In Fig. 2 we show p_k, the fraction of cooperators or defectors
of a particular degree k in the steady state (t > 500). The
three different regions show different structure. For region I,
represented by b_0 = 1.3 [Fig. 2(a)], p_k is larger for
cooperators than defectors if k > 3. If k > 77, all nodes are
cooperators. Since the final densities of C and D are equal in
such a situation, the high-degree C can protect their neighbors
from imitating defectors, and thus support cooperation. For
region II, exemplified by b_0 = 3.5 [Fig. 2(b)] where the
steady state is all-C (so p_k = 0 for defectors at all k), we
find that p_k has a functional form closely described by a
power law with exponential cutoff and a decay exponent of about
2.7. Since the steady state, in this case, is all-C, the payoff
an agent can accrue will depend linearly on its degree.
Consequently, during the process of rewiring, the probability
that an agent gets new links will be approximately proportional
to the degree it already has. In a strictly growing network,
"preferential attachment" is known to generate a power-law
degree distribution [10]. In this case, with networks fixed in
size, preferential attachment is not enough to explain the
degree distribution: it needs to be balanced by an
antipreferential deletion of edges [11] in order for a
power-law degree distribution to appear. The power-law-like
degree distribution remains for larger values of b_0 despite
the different steady-state values of p_k. For a system in the
all-D state, the rewiring process behaves differently than in
the all-C case. Since the payoff a defector gets is independent
of the total number of links it already has, its nonlocal link
will be rewired randomly to another D, which generates networks
with a Poisson degree distribution, as observed in Fig. 2(c).

Fig. 2 suggests that the coevolution of the contact patterns
and the payoff matrix, in region II, makes the underlying
network change from its initially random state to a
heterogeneous structure. As shown in Fig. 3(a), the cumulative
degree distribution P(k > K) (the probability that an observed
degree k is larger than K) depends strongly on b_0. In
particular, for b_0 = 3.5 the distribution follows a power law
over two decades. For sufficiently large b_0, we observe a
decay of the form P ~ A exp(-K/K_0) + B exp(-K^/K_1) (K_0 and
K_1 are fitting parameters), a combination of an exponential
and a stretched exponential form. The stretched exponential can, as




FIG. 3: (Color online) Structural properties of the network in the
steady state for different values of b_0. (a) displays the cumulative
degree distribution. The line for b_0 -1 follows a decay like a sum of
an exponential and a stretched exponential function. (b) shows the
clustering coefficient C as a function of degree k. The line marks a
scaling with exponent -1.



mentioned above, be generated by a (non-linear) preferential
attachment [12]. In Fig. 3(b), we investigate the hierarchical
features of the steady-state networks in greater detail. It
has been argued that a characteristic feature of hierarchical
networks is that the clustering coefficient (the fraction of
possible triangles that a node of a given degree is a member
of) is inversely proportional to degree [13]. This is indeed
what we observe for large b_0 values.
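
Both structural measures used here, the cumulative degree distribution P(k > K) and the local clustering coefficient, are straightforward to compute. The sketch below is illustrative only: it assumes the long-range network is treated as undirected and stored as a dictionary mapping each node to the set of its neighbors, and the function names are invented for this example.

```python
from collections import Counter
from itertools import combinations


def cumulative_degree_distribution(degrees):
    """Return {K: P(k > K)} for K = 0 .. max degree (degrees must be non-empty)."""
    n = len(degrees)
    counts = Counter(degrees)
    remaining = n  # number of nodes with degree strictly larger than K
    result = {}
    for K in range(max(degrees) + 1):
        remaining -= counts.get(K, 0)
        result[K] = remaining / n
    return result


def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # no neighbor pairs, so no triangles are possible
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))
```

Averaging `local_clustering` over all nodes of the same degree k gives the quantity C(k) whose inverse proportionality to k is discussed above.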

In conclusion, we have studied a game-theoretical model
with feedback from the behavior of the agents to the rules
of the game, via the payoff matrix, and an active optimization
of both the contact structure between the agents and their
strategies. With respect to the average cooperation density,
the model is a non-equilibrium model. This makes the initial
temptation value b_0 a crucial model parameter. We identify
three regions of distinct dynamic behavior. In region I, the
average cooperator density relaxes to a stable level through
damped oscillations; in region III the system reaches an
all-defect state. For intermediate b_0-values (region II), the
system ends at one of three fixed points, 0, p*, or 1, with
parameter-dependent probabilities. For some parameter values in
this region, the system will almost certainly reach an
all-cooperator state. The all-cooperator state is absorbing,
but if one extends this model to a non-equilibrium model, it
would not be stable
to mutations in u. In the all-C state, the network has the most
heterogeneous degree distribution, and also a clear C ~ 1/k
scaling of the clustering coefficient. Ref. [13] argues that this
feature is indicative of a hierarchical organization of the
system. This is in contrast to usual explanations of social
hierarchies as resulting from external factors such as age or
fitness [14] or internal heterogeneities. The latter case is
true also for our model in the limit of no environmental
feedback, in which case it reduces to the model of Ref. [4].
But the network dynamics is also needed for the hierarchical
topology and cooperation to co-emerge. If there is no network
dynamics, the cooperation stabilizes at some intermediate
p-value and does not reach the all-C state. In this case a
power-law degree distribution emerges for intermediate
cooperator levels. In other game-theoretic situations,
hierarchical organization has sometimes proven to support
cooperation [15], sometimes to destabilize it [16]. The
co-emergence of cooperation and a hierarchical topology in our
model comes from the cooperators being stabilized by
high-degree nodes, while there is no similar effect for the
defectors. A similar positive feedback mechanism between degree
and payoff of cooperators drives the emergence of cooperation
in the model of Ref. [17]. This model differs from ours in that
the payoff matrix is fixed and not a function of the state of
the system.

In summary, our work shows a new possible mechanism for 
the coemergence of hierarchical structures and cooperation. 
We foresee more studies of games in flexible settings where
the game itself determines its rules and the player can choose
when [18] and with whom [3] to interact, based on its strategy.